
Korean Title: Channel-wise Mixed precision considering quantization sensitivity and layer characteristics
English Title: Channel-wise Mixed precision considering quantization sensitivity and layer characteristics
Author(s): Yongjoo Lee, Mincheal Kang, Jiwon Seo
Citation: Vol. 49, No. 02, pp. 580-582 (2022.12)
Korean Abstract:

English Abstract:
Quantization is a model compression technique that represents continuous real numbers as discrete values in order to reduce computation and memory footprint and to improve inference speed. Recently, building on mixed precision, which mixes two or more bitwidths instead of using a single one, research has applied reinforcement learning to analyze layer-wise or channel-wise characteristics and find an appropriate bitwidth for each. Reinforcement learning can minimize quantization error and thereby maximize performance, but exploring the enormous number of bit combinations is costly. In this paper, using the two data types INT4 and INT8, we 1) update the bit combination by gradually increasing the ratio of INT8 channels based on a quantization sensitivity analysis of each channel of the convolution layers, and 2) apply QAT (Quantization Aware Training) to keep the degradation of model performance to a minimum. In our experiments, the optimal bit combination was found within at most 20 search iterations, and performance was maintained at the level obtained when every convolution layer uses the INT8 data type.
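The search procedure summarized in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the uniform symmetric quantizer, the per-channel sensitivity metric (extra quantization error of INT4 relative to INT8), the 5% step size for the INT8 ratio, and the evaluate callback (which in the paper's setting would wrap QAT fine-tuning and validation) are all assumptions made for illustration.

import numpy as np

def quantize(w, bits):
    # Uniform symmetric quantization of a weight array to the given bitwidth.
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(w))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

def channel_sensitivity(layer_w):
    # Per-channel sensitivity: extra quantization error (MSE) of INT4 over INT8.
    # layer_w has shape (out_channels, ...) for one convolution layer.
    sens = []
    for ch in layer_w:
        err4 = np.mean((ch - quantize(ch, 4)) ** 2)
        err8 = np.mean((ch - quantize(ch, 8)) ** 2)
        sens.append(err4 - err8)
    return np.array(sens)

def assign_bits(layer_w, int8_ratio):
    # Give INT8 to the most sensitive channels, INT4 to the rest.
    sens = channel_sensitivity(layer_w)
    n_int8 = int(round(int8_ratio * len(sens)))
    int8_channels = set(np.argsort(-sens)[:n_int8])
    return [8 if c in int8_channels else 4 for c in range(len(sens))]

def search_bit_combination(layer_w, evaluate, target_acc, step=0.05, max_iters=20):
    # Gradually raise the INT8 ratio until accuracy recovers or the budget is spent.
    # `evaluate(bits)` is a user-supplied callback that applies the assignment
    # (followed by QAT fine-tuning) and returns validation accuracy.
    ratio, bits, acc = 0.0, None, 0.0
    for _ in range(max_iters):
        bits = assign_bits(layer_w, ratio)
        acc = evaluate(bits)
        if acc >= target_acc:
            break
        ratio = min(1.0, ratio + step)
    return bits, ratio, acc

Under this sketch the search evaluates at most max_iters = 20 candidate bit combinations, matching the budget reported in the abstract; a different sensitivity metric or ratio schedule could be substituted in assign_bits and search_bit_combination without changing the overall structure.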
Keywords: